Last updated: 2020-11-23


Knit directory: factor_analysis/

This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.

The command set.seed(20200623) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Using absolute paths to the files within your workflowr project makes it difficult for you and others to run your code on a different machine. Change the absolute path(s) below to the suggested relative path(s) to make your code more reproducible.

absolute relative
/project2/xinhe/xsun/website/factor_analysis/output/sum_sep_tog_pval_pliercanon_wgpc_ldcl_d1k_ageco.rdata output/sum_sep_tog_pval_pliercanon_wgpc_ldcl_d1k_ageco.rdata
/project2/xinhe/xsun/website/factor_analysis/output/info_pval5e8_pliercanon_ld_d1k_ageco.rdata output/info_pval5e8_pliercanon_ld_d1k_ageco.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lv27top025_pliercanon.rdata output/lv27top025_pliercanon.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lv76top025_pliercanon.rdata output/lv76top025_pliercanon.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lv90top025_pliercanon.rdata output/lv90top025_pliercanon.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lv49top025_pliercanon.rdata output/lv49top025_pliercanon.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lv82top025_pliercanon.rdata output/lv82top025_pliercanon.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lv36top025_pliercanon.rdata output/lv36top025_pliercanon.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lv39top025_pliercanon.rdata output/lv39top025_pliercanon.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lv6top025_pliercanon.rdata output/lv6top025_pliercanon.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lv119top025_pliercanon.rdata output/lv119top025_pliercanon.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lv23top025_pliercanon.rdata output/lv23top025_pliercanon.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lv26top025_pliercanon.rdata output/lv26top025_pliercanon.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lv125top025_pliercanon.rdata output/lv125top025_pliercanon.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lv47top025_pliercanon.rdata output/lv47top025_pliercanon.rdata
/project2/xinhe/xsun/website/factor_analysis/output/bmi_27_53759010.coloc_pliercanon_d1k_500.rdata output/bmi_27_53759010.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/bmi_27_85758440.coloc_pliercanon_d1k_500.rdata output/bmi_27_85758440.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/bmi_76_40380914.coloc_pliercanon_d1k_500.rdata output/bmi_76_40380914.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/bmi_76_43842728.coloc_pliercanon_d1k_500.rdata output/bmi_76_43842728.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/LDL_125_160687504.coloc_pliercanon_d1k_500.rdata output/LDL_125_160687504.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/LDL_125_44943964.coloc_pliercanon_d1k_500.rdata output/LDL_125_44943964.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/LDL_125_56955678.coloc_pliercanon_d1k_500.rdata output/LDL_125_56955678.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/LDL_125_56965346.coloc_pliercanon_d1k_500.rdata output/LDL_125_56965346.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lymph_26_41686157.coloc_pliercanon_d1k_500.rdata output/lymph_26_41686157.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lymph_26_59258463.coloc_pliercanon_d1k_500.rdata output/lymph_26_59258463.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/plt_49_34414160.coloc_pliercanon_d1k_500.rdata output/plt_49_34414160.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/plt_49_56786080.coloc_pliercanon_d1k_500.rdata output/plt_49_56786080.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/plt_49_88978375.coloc_pliercanon_d1k_500.rdata output/plt_49_88978375.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/rbc_82_32506416.coloc_pliercanon_d1k_500.rdata output/rbc_82_32506416.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/rbc_82_42718545.coloc_pliercanon_d1k_500.rdata output/rbc_82_42718545.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/rbc_82_54482753.coloc_pliercanon_d1k_500.rdata output/rbc_82_54482753.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/rbc_82_8614076.coloc_pliercanon_d1k_500.rdata output/rbc_82_8614076.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/wbc_6_129655816.coloc_pliercanon_d1k_500.rdata output/wbc_6_129655816.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/wbc_6_81575948.coloc_pliercanon_d1k_500.rdata output/wbc_6_81575948.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/wbc_119_16589041.coloc_pliercanon_d1k_500.rdata output/wbc_119_16589041.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/wbc_119_31355703.coloc_pliercanon_d1k_500.rdata output/wbc_119_31355703.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/WHR_47_53955989.coloc_pliercanon_d1k_500.rdata output/WHR_47_53955989.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/WHR_47_6742916.coloc_pliercanon_d1k_500.rdata output/WHR_47_6742916.coloc_pliercanon_d1k_500.rdata
/project2/xinhe/xsun/website/factor_analysis/output/bmi_27_53759010.coloc_pliercanon_d1k.rdata output/bmi_27_53759010.coloc_pliercanon_d1k.rdata
/project2/xinhe/xsun/website/factor_analysis/output/bmi_27_85758440.coloc_pliercanon_d1k.rdata output/bmi_27_85758440.coloc_pliercanon_d1k.rdata
/project2/xinhe/xsun/website/factor_analysis/output/bmi_76_40380914.coloc_pliercanon_d1k.rdata output/bmi_76_40380914.coloc_pliercanon_d1k.rdata
/project2/xinhe/xsun/website/factor_analysis/output/bmi_76_43842728.coloc_pliercanon_d1k.rdata output/bmi_76_43842728.coloc_pliercanon_d1k.rdata
/project2/xinhe/xsun/website/factor_analysis/output/LDL_125_160687504.coloc_pliercanon_d1k.rdata output/LDL_125_160687504.coloc_pliercanon_d1k.rdata
/project2/xinhe/xsun/website/factor_analysis/output/LDL_125_44943964.coloc_pliercanon_d1k.rdata output/LDL_125_44943964.coloc_pliercanon_d1k.rdata
/project2/xinhe/xsun/website/factor_analysis/output/LDL_125_56955678.coloc_pliercanon_d1k.rdata output/LDL_125_56955678.coloc_pliercanon_d1k.rdata
/project2/xinhe/xsun/website/factor_analysis/output/LDL_125_56965346.coloc_pliercanon_d1k.rdata output/LDL_125_56965346.coloc_pliercanon_d1k.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lymph_26_41686157.coloc_pliercanon_d1k.rdata output/lymph_26_41686157.coloc_pliercanon_d1k.rdata
/project2/xinhe/xsun/website/factor_analysis/output/lymph_26_59258463.coloc_pliercanon_d1k.rdata output/lymph_26_59258463.coloc_pliercanon_d1k.rdata
/project2/xinhe/xsun/website/factor_analysis/output/plt_49_34414160.coloc_pliercanon_d1k.rdata output/plt_49_34414160.coloc_pliercanon_d1k.rdata
/project2/xinhe/xsun/website/factor_analysis/output/plt_49_56786080.coloc_pliercanon_d1k.rdata output/plt_49_56786080.coloc_pliercanon_d1k.rdata
/project2/xinhe/xsun/website/factor_analysis/output/plt_49_88978375.coloc_pliercanon_d1k.rdata output/plt_49_88978375.coloc_pliercanon_d1k.rdata
/project2/xinhe/xsun/website/factor_analysis/output/rbc_82_32506416.coloc_pliercanon_d1k.rdata output/rbc_82_32506416.coloc_pliercanon_d1k.rdata
/project2/xinhe/xsun/website/factor_analysis/output/rbc_82_42718545.coloc_pliercanon_d1k.rdata output/rbc_82_42718545.coloc_pliercanon_d1k.rdata
/project2/xinhe/xsun/website/factor_analysis/output/rbc_82_54482753.coloc_pliercanon_d1k.rdata output/rbc_82_54482753.coloc_pliercanon_d1k.rdata
/project2/xinhe/xsun/website/factor_analysis/output/rbc_82_8614076.coloc_pliercanon_d1k.rdata output/rbc_82_8614076.coloc_pliercanon_d1k.rdata
/project2/xinhe/xsun/website/factor_analysis/output/wbc_6_129655816.coloc_pliercanon_d1k.rdata output/wbc_6_129655816.coloc_pliercanon_d1k.rdata
/project2/xinhe/xsun/website/factor_analysis/output/wbc_6_81575948.coloc_pliercanon_d1k.rdata output/wbc_6_81575948.coloc_pliercanon_d1k.rdata
/project2/xinhe/xsun/website/factor_analysis/output/wbc_119_16589041.coloc_pliercanon_d1k.rdata output/wbc_119_16589041.coloc_pliercanon_d1k.rdata
/project2/xinhe/xsun/website/factor_analysis/output/wbc_119_31355703.coloc_pliercanon_d1k.rdata output/wbc_119_31355703.coloc_pliercanon_d1k.rdata
/project2/xinhe/xsun/website/factor_analysis/output/WHR_47_53955989.coloc_pliercanon_d1k.rdata output/WHR_47_53955989.coloc_pliercanon_d1k.rdata
/project2/xinhe/xsun/website/factor_analysis/output/WHR_47_6742916.coloc_pliercanon_d1k.rdata output/WHR_47_6742916.coloc_pliercanon_d1k.rdata
/project2/xinhe/xsun/website/factor_analysis/output/sum_sep_tog_pval_pliercanon_wgpc_ldcl_d1k.rdata output/sum_sep_tog_pval_pliercanon_wgpc_ldcl_d1k.rdata
/project2/xinhe/xsun/website/factor_analysis/output/info_pval5e8_pliercanon_ld_d1k.rdata output/info_pval5e8_pliercanon_ld_d1k.rdata
/project2/xinhe/xsun/website/factor_analysis/output/sum_sep_tog_pval_pliercanon_wgpc_ldcl_d1k_noageco.rdata output/sum_sep_tog_pval_pliercanon_wgpc_ldcl_d1k_noageco.rdata
/project2/xinhe/xsun/website/factor_analysis/output/info_pval5e8_pliercanon_ld_d1k_noageco.rdata output/info_pval5e8_pliercanon_ld_d1k_noageco.rdata

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 691daa9. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .RData
    Ignored:    .Rhistory
    Ignored:    analysis/.Rhistory

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd) and HTML (docs/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File Version Author Date Message
Rmd 691daa9 XSun 2020-11-23 update
html c729bb2 XSun 2020-11-21 Build site.
html fc29732 XSun 2020-11-21 Build site.
Rmd 50faac0 XSun 2020-11-21 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 5ffe456 XSun 2020-11-20 Build site.
Rmd 1349673 XSun 2020-11-20 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 182a46e XSun 2020-11-20 Build site.
Rmd 1f761b4 XSun 2020-11-20 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
Rmd e947318 XSun 2020-11-19 update
html 36b155f XSun 2020-11-18 Build site.
Rmd 8dcca15 XSun 2020-11-18 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html dead226 XSun 2020-11-18 Build site.
Rmd 818deca XSun 2020-11-18 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
Rmd 0a5f8b5 XSun 2020-11-18 update
html da71804 XSun 2020-11-02 Build site.
Rmd 02226ad XSun 2020-11-02 update
Rmd 377743f XSun 2020-11-02 update
html fb88efd XSun 2020-10-29 Build site.
Rmd bcb9478 XSun 2020-10-29 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 1ea5535 XSun 2020-10-29 Build site.
Rmd b26e318 XSun 2020-10-29 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 23966ec XSun 2020-10-28 Build site.
Rmd 5346342 XSun 2020-10-28 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 9f1cff7 XSun 2020-10-28 Build site.
Rmd da59019 XSun 2020-10-28 update
html 77a2489 XSun 2020-10-28 Build site.
Rmd cd10457 XSun 2020-10-28 update
Rmd f265455 XSun 2020-10-28 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html a14cfaa XSun 2020-10-28 Build site.
Rmd 58fbdd2 XSun 2020-10-28 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 8e64234 XSun 2020-10-24 Build site.
Rmd ef67b09 XSun 2020-10-24 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 4793c9c XSun 2020-10-24 Build site.
Rmd 05cf21b XSun 2020-10-24 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 467f893 XSun 2020-10-24 Build site.
Rmd 021b0b4 XSun 2020-10-24 update
html 6b65c0c XSun 2020-10-22 Build site.
Rmd d13e337 XSun 2020-10-22 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 8ba3ee9 XSun 2020-10-22 Build site.
Rmd aa16fcb XSun 2020-10-22 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html baa1f02 XSun 2020-10-22 Build site.
Rmd 2dd03ae XSun 2020-10-22 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 36ea64d XSun 2020-10-22 Build site.
Rmd 19694ce XSun 2020-10-22 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
Rmd 6f6c907 XSun 2020-10-22 update
html 1cc9328 XSun 2020-10-22 Build site.
Rmd b94cb1b XSun 2020-10-22 update
html 00f41ae XSun 2020-10-21 Build site.
Rmd 312defa XSun 2020-10-21 update
Rmd 9505772 XSun 2020-10-19 update
html 3567877 XSun 2020-10-19 Build site.
Rmd db41780 XSun 2020-10-19 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
Rmd b197942 XSun 2020-10-19 update
html 3250458 XSun 2020-10-19 Build site.
Rmd d04088b XSun 2020-10-19 update
html 3d0c708 XSun 2020-10-19 Build site.
Rmd 2315fbc XSun 2020-10-19 update
html acfd22d XSun 2020-10-18 Build site.
Rmd 6218b87 XSun 2020-10-18 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 1327f34 XSun 2020-10-18 Build site.
Rmd bc3dc2d XSun 2020-10-18 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html f033bab XSun 2020-10-18 Build site.
Rmd 9535d29 XSun 2020-10-18 update
html d11838b XSun 2020-10-16 Build site.
Rmd bddde1b XSun 2020-10-16 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 39c0f34 XSun 2020-10-16 Build site.
Rmd 73f6e34 XSun 2020-10-16 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html e1e20c9 XSun 2020-10-16 Build site.
Rmd c10d340 XSun 2020-10-16 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 83c7381 XSun 2020-10-16 Build site.
Rmd 8407ce3 XSun 2020-10-16 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 5769bae XSun 2020-10-16 Build site.
Rmd a13fb68 XSun 2020-10-16 update
html 6c0de70 XSun 2020-10-16 Build site.
html 00d36e3 XSun 2020-10-16 Build site.
Rmd a615ad1 XSun 2020-10-16 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 28aea11 XSun 2020-10-15 Build site.
Rmd fa849c5 XSun 2020-10-15 wflow_publish(“analysis/catalog_gwas_pliercanon_sep_ldclumping_r01d1000.Rmd”)
html 0046ff2 XSun 2020-10-14 Build site.
Rmd 14e53c8 XSun 2020-10-14 update

Introduction

In this part, I considered the traits separately. For each trait, I selected the SNPs with pval < 5e-8, then ran LD clumping on them to remove SNPs in LD with each other and obtain a smaller subset. After that, I ran association tests between the plier_canonical factors and the clumped SNPs. I also ran colocalization analysis for some significant factor~SNP pairs.

Material and Methods

  1. Catalog GWAS data:
  • Platelet count, white blood cell count, myeloid white cell count, lymphocyte counts, red blood cell count, granulocyte count, eosinophil count, neutrophil count from Astle WJ, Elding H, Jiang T, et al. The Allelic Landscape of Human Blood Cell Trait Variation and Links to Common Complex Disease. Cell. 2016;167(5):1415-1429.e19. doi:10.1016/j.cell.2016.10.042.

  • T2D. I first used data from our lab collection: Morris et al. Large-scale association analysis provides insights into the genetic architecture and pathophysiology of type 2 diabetes. Nat Genet. 2012 Sep;44(9):981-90. doi: 10.1038/ng.2383. Epub 2012 Aug 12. PMID: 22885922; PMCID: PMC3442244. However, it does not contain MAF information for the variants, so I added a second data set from the GWAS Catalog: Wood AR et al. Variants in the FTO and CDKAL1 loci have recessive effects on risk of obesity and type 2 diabetes, respectively. Diabetologia. 2016 Jun;59(6):1214-21. doi: 10.1007/s00125-016-3908-5. Epub 2016 Mar 10. PMID: 26961502; PMCID: PMC4869698.

  • Asthma. I first used data from our lab collection: Zhu et al. Shared Genetics of Asthma and Mental Health Disorders: A Large-Scale Genome-Wide Cross-Trait Analysis. European Respiratory Journal, 2019 (PMID: 31619474). However, it does not contain MAF information for the variants, so I added a second data set from the GWAS Catalog: Ferreira MAR et al. Genetic Architectures of Childhood- and Adult-Onset Asthma Are Partly Distinct. The American Journal of Human Genetics, Volume 104, Issue 4, 2019, Pages 665-684, ISSN 0002-9297, https://doi.org/10.1016/j.ajhg.2019.02.022.

  • IBD, ulcerative colitis, and Crohn’s disease data are from the lab collection: Liu, van Sommeren et al., Nature Genetics, 2015.

  • Waist-hip ratio data are from Shungin et al. New genetic loci link adipose and insulin biology to body fat distribution. Nature. 2015 Feb 12;518(7538):187-196. doi: 10.1038/nature14132. PMID: 25673412; PMCID: PMC4338562.

  • BMI data are also from lab collection: Locke AE, Kahali B, Berndt SI, Justice AE, Pers TH, Day FR, Powell C, Vedantam S, Buchkovich ML, Yang J, Croteau-Chonka DC, Esko T et al. (2015). Genetic studies of body mass index yield new insights for obesity biology. Nature 518, 197-206

  • HDL & LDL data were downloaded from the GWAS Catalog: Global Lipids Genetics Consortium, Willer, C., Schmidt, E. et al. Discovery and refinement of loci associated with lipid levels. Nat Genet 45, 1274-1283 (2013). https://doi.org/10.1038/ng.2797

  2. Filtered the SNPs from the GWAS Catalog data using pval < 5e-8 as the cutoff.

  3. Did LD clumping for the SNPs from step 2. The PLINK LD clumping parameters are:

    --clump-p1 0.0001 Significance threshold for index SNPs

    --clump-p2 0.01 Secondary significance threshold for clumped SNPs

    --clump-r2 0.1 LD threshold for clumping

    --clump-kb 1000 Physical distance threshold for clumping

For each trait, this gave a subset of SNPs that are not in LD with each other.
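The filtering and clumping steps above can be sketched as follows. This is a minimal illustration of greedy clumping in Python, not PLINK's implementation: the `Snp` record, the `r2` lookup function, and the function names are hypothetical, and the default thresholds simply mirror the parameters listed above.

```python
from typing import Callable, Dict, List, NamedTuple


class Snp(NamedTuple):
    rsid: str
    chrom: str
    pos: int      # base-pair position
    pval: float   # GWAS p-value


def genome_wide_significant(snps: List[Snp], threshold: float = 5e-8) -> List[Snp]:
    """Keep SNPs whose GWAS p-value is below the cutoff."""
    return [s for s in snps if s.pval < threshold]


def clump(snps: List[Snp],
          r2: Callable[[str, str], float],   # hypothetical LD lookup
          p1: float = 1e-4, p2: float = 1e-2,
          r2_min: float = 0.1, kb: int = 1000) -> Dict[str, List[str]]:
    """Greedy clumping: map each index SNP to the SNPs clumped under it."""
    ordered = sorted(snps, key=lambda s: s.pval)
    assigned = set()
    clumps: Dict[str, List[str]] = {}
    for index in ordered:
        if index.pval > p1:
            break                      # remaining SNPs cannot be index SNPs
        if index.rsid in assigned:
            continue                   # already absorbed by a stronger SNP
        assigned.add(index.rsid)
        members = []
        for other in ordered:
            if other.rsid in assigned or other.pval > p2:
                continue
            if other.chrom != index.chrom:
                continue
            if abs(other.pos - index.pos) > kb * 1000:
                continue
            if r2(index.rsid, other.rsid) >= r2_min:
                members.append(other.rsid)
                assigned.add(other.rsid)
        clumps[index.rsid] = members
    return clumps
```

The keys of the returned dictionary are the index SNPs, which play the role of the clumped subset used in the later association tests.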

  4. Did association tests between the plier_canonical factors and the SNPs from step 3. The association tests were adjusted for three covariate sets: 1) 10 whole-genome genotype PCs; 2) 10 PCs + GTEx sequencing platform, sequencing protocol, and sex; 3) 10 PCs + GTEx sequencing platform, sequencing protocol, sex, and age.

  5. For each trait, I plotted the association with the LV (indicated by the beta in GWAS) against the association with the trait (indicated by ln(odds ratio) or beta in GWAS) to show whether the variants have correlated effect directions. The effect sizes from the Catalog GWAS and the factor association tests were harmonized with the TwoSampleMR R package so that the effect alleles in the two analyses are identical. LVs with more than one significant SNP at FDR < 0.2 are included in the plots. In addition, for each plot, I fitted the points with intercept = 0. The p-values and r-squared are shown on the plots.

  6. For the traits and LVs in step 5, I made an info table to show more details of the SNPs.

  7. For several LVs of interest, I did gene set enrichment analysis (GSEA) to test whether the LVs are associated with KEGG/REACTOME pathways. I used two kinds of gene sets: 1) the genes used to compute the LVs; 2) the top 25% of the genes in 1), ranked by their loadings. For both gene sets, the gene scores used as GSEA input are the gene loadings.

  8. Resampling. For some promising trait-factor pairs, I resampled the SNPs without replacement, fitted the points with intercept = 0 again, and recorded the p-values and r-squared. The resampling was repeated 1000 times. The following plots are the resampling results.
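The zero-intercept fit and the resampling loop in the last step can be sketched as below. This is an illustrative Python version under stated assumptions: the subsample fraction `frac=0.8` is hypothetical (the text does not give the subsample size), the function names are invented for the example, and the seed reuses the value from `set.seed(20200623)` only for familiarity, since Python's generator will not reproduce R's random stream. The sketch records the slope and the uncentered r-squared, which is the quantity R reports for a model without an intercept; the p-value would additionally need a t distribution from a statistics library.

```python
import random
from typing import List, Sequence, Tuple


def fit_through_origin(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of y = slope * x with no intercept.

    Returns (slope, uncentered R^2), where R^2 = 1 - RSS / sum(y^2).
    """
    sxx = sum(a * a for a in x)
    sxy = sum(a * b for a, b in zip(x, y))
    syy = sum(b * b for b in y)
    slope = sxy / sxx
    r2 = (sxy * sxy) / (sxx * syy) if syy > 0 else float("nan")
    return slope, r2


def resample_fits(x: Sequence[float], y: Sequence[float],
                  n_rep: int = 1000, frac: float = 0.8,
                  seed: int = 20200623) -> List[Tuple[float, float]]:
    """Refit the through-origin line on random SNP subsets (no replacement)."""
    rng = random.Random(seed)
    n = len(x)
    k = min(n, max(2, int(round(frac * n))))   # subset size, at least 2 SNPs
    fits = []
    for _ in range(n_rep):
        idx = rng.sample(range(n), k)          # sampling without replacement
        fits.append(fit_through_origin([x[i] for i in idx],
                                       [y[i] for i in idx]))
    return fits
```

Here `x` would hold the SNP effects on the trait and `y` the effects on the LV; the spread of the recorded slopes and r-squared values over the repetitions indicates how strongly the fit depends on individual SNPs.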

SNPs after filtering

After filtering by ‘pval < 5e-8’ and LD clumping, the number of SNPs remaining for each trait is:

platelet count trait contains 688 SNPs with pval<5e-8.

white blood cell count trait contains 368 SNPs with pval<5e-8.

myeloid white cell count trait contains 319 SNPs with pval<5e-8.

lymphocyte count trait contains 436 SNPs with pval<5e-8.

red blood cell count trait contains 466 SNPs with pval<5e-8.

granulocyte count trait contains 316 SNPs with pval<5e-8.

eosinophil count trait contains 491 SNPs with pval<5e-8.

neutrophil count trait contains 317 SNPs with pval<5e-8.

IBD trait contains 116 SNPs with pval<5e-8.

Ulcerative colitis trait contains 73 SNPs with pval<5e-8.

Crohn’s disease trait contains 96 SNPs with pval<5e-8.

BMI trait contains 104 SNPs with pval<5e-8.

T2D contains 14 SNPs with pval<5e-8. T2D_2 contains 4 SNPs with pval<5e-8.

Asthma trait contains 186 SNPs with pval<5e-8. Asthma_2 trait contains 112 SNPs with pval<5e-8.

HDL trait contains 227 SNPs with pval<5e-8.

LDL trait contains 204 SNPs with pval<5e-8.

WHR trait contains 36 SNPs with pval<5e-8.

Results - pval < 5e-8 & association test covariates: 10 PCs + GTEx: sequencing platform, sequencing protocol, sex + age

Summary table

I used the ‘qvalue’ R package to compute the FDR from the p-values for each SNP and made a table showing the number of SNPs that pass each threshold. The thresholds are ‘fdr < 0.1’, ‘fdr < 0.2’, and ‘pval < 5e-8’. ‘num_significant_pairs’ is the number of significant pairs under each threshold. A trait~factor pair is called a ‘significant pair’ if it has at least 1 significant SNP.
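The summary-table logic above can be illustrated with a short Python sketch. Note the substitution: the analysis uses the ‘qvalue’ R package, whereas this sketch uses the simpler Benjamini-Hochberg adjustment as a stand-in (qvalue additionally estimates the proportion of true nulls, so its q-values can be smaller). The record layout `(trait, factor, fdr)` and the function names are hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple


def bh_adjust(pvals: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def count_significant_pairs(records: Iterable[Tuple[str, str, float]],
                            threshold: float) -> int:
    """Count trait~factor pairs with at least one SNP below the threshold."""
    return len({(trait, factor)
                for trait, factor, fdr in records
                if fdr < threshold})
```

Counting distinct `(trait, factor)` tuples, rather than rows, is what makes a pair with several significant SNPs count only once.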

Info table

For each trait, I made a table showing the info of SNPs with FDR < 0.2 in the factor ~ SNP + genotype PCs association test. For each trait, LVs with more than one significant SNP at FDR < 0.2 are included.

The suffix ‘_assoc’ means that the results are from the factor ~ SNP + genotype PCs association test. The suffix ‘_gwas’ means that the results are from the original GWAS results files. For EUR.CD, EUR.IBD, EUR.UC, T2D, and asthma, effectsize_gwas is ‘ln(OR)’; for the other traits, it is ‘beta’.

‘snp_ld’ lists the SNPs in LD with the SNP on each line, and ‘ld_r2’ gives the corresponding LD r-squared. The ‘cis-eqtl’ column indicates whether the SNP is a cis-eQTL according to GTEx data. ‘cis_gene_hgnc’ lists the genes that the SNP influences when it acts as a cis-eQTL. ‘func’ and ‘func_gene’ are obtained from ANNOVAR and indicate the function of the SNP within the genes.

Enrichment analysis

For some promising trait-factor pairs (i.e. BMI-LV3, LV27, RBC-LV82, Asthma-LV68, WBC-LV119, Lymphocyte-LV23, LV78), I did enrichment analysis with WebGestalt. The analyses were run under the following settings:

  1. ORA: For each factor, I sorted the genes with non-zero loadings by their loadings and took the top 25% as the gene set. The functional databases are Reactome pathway; Disgenet + GLAD4U + OMIM disease datasets; and geneontology biological process. The reference set is affy hugene 2 0 st v1. The minimum number of genes for a category is 5 and the maximum is 2000 (default settings). All categories with fdr<0.01 are listed.
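As a rough illustration of the ORA setting above, the sketch below selects the top 25% of genes with non-zero loadings and computes a one-sided hypergeometric enrichment p-value for a single category. It is a conceptual Python sketch, not WebGestalt's implementation; the function names are hypothetical, and ranking in descending order of loading is an assumption.

```python
from math import comb
from typing import Dict, List


def top_loading_genes(loadings: Dict[str, float], frac: float = 0.25) -> List[str]:
    """Genes with non-zero loadings, sorted descending, top `frac` kept."""
    nonzero = [(g, v) for g, v in loadings.items() if v != 0]
    nonzero.sort(key=lambda gv: gv[1], reverse=True)
    n_keep = max(1, int(len(nonzero) * frac))
    return [g for g, _ in nonzero[:n_keep]]


def ora_pvalue(n_reference: int, n_category: int,
               n_selected: int, n_overlap: int) -> float:
    """P(X >= n_overlap) under the hypergeometric null.

    n_reference: genes in the reference set
    n_category:  reference genes annotated to the category
    n_selected:  genes in the submitted gene set
    n_overlap:   submitted genes annotated to the category
    """
    upper = min(n_category, n_selected)
    total = comb(n_reference, n_selected)
    tail = sum(comb(n_category, k) * comb(n_reference - n_category, n_selected - k)
               for k in range(n_overlap, upper + 1))
    return tail / total
```

In a full analysis this p-value would be computed for every category within the size limits (5 to 2000 genes) and then adjusted for multiple testing before applying the fdr<0.01 cutoff.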

BMI-LV27

BMI-LV76

BMI-LV90

PLT-LV49

RBC-LV82

Asthma-LV36

Asthma-LV39

WBC-LV6


WBC-LV119

Lymphocyte-LV23

Lymphocyte-LV26

LDL-LV125

WHR

Effect size plots

For each trait, I plotted the association with the LV (indicated by the beta in GWAS) against the association with the trait (indicated by ln(odds ratio) or beta in GWAS) to show whether the variants have correlated effect directions. The effect sizes from the Catalog GWAS and the factor association tests were harmonized with the TwoSampleMR R package so that the effect alleles in the two analyses are identical. LVs with more than one significant SNP at FDR < 0.2 are included in the plots. In addition, for each plot, I fitted the points with intercept = 0. The p-values and r-squared are shown on the plots.

I also relaxed the FDR threshold for the SNPs used in the effect size plots (from 0.2 to 0.3/0.5).
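The idea behind the effect-allele harmonization mentioned above can be shown with a small Python sketch. It covers only the simplest case, flipping the sign of the GWAS effect when the two analyses report swapped alleles; the TwoSampleMR package also deals with strand issues and palindromic SNPs, which this sketch merely flags. The function names are hypothetical.

```python
from typing import Optional


def is_palindromic(a1: str, a2: str) -> bool:
    """A/T and C/G SNPs look the same on both strands, so they are ambiguous."""
    return {a1.upper(), a2.upper()} in ({"A", "T"}, {"C", "G"})


def harmonize_beta(beta_gwas: float,
                   effect_gwas: str, other_gwas: str,
                   effect_assoc: str, other_assoc: str) -> Optional[float]:
    """Express the GWAS effect relative to the association-test effect allele.

    Returns the (possibly sign-flipped) beta, or None when the allele pairs
    do not match and the SNP should be dropped or inspected by hand.
    """
    gwas = (effect_gwas.upper(), other_gwas.upper())
    assoc = (effect_assoc.upper(), other_assoc.upper())
    if gwas == assoc:
        return beta_gwas            # same effect allele, keep as is
    if gwas == (assoc[1], assoc[0]):
        return -beta_gwas           # alleles swapped, flip the sign
    return None                     # incompatible alleles
```

Because ln(odds ratio) changes sign in the same way as a beta when the effect allele is swapped, the same flip applies to the case-control traits.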

Asthma_2

BMI

Eosinophil count

Crohn’s disease

IBD

Ulcerative colitis

LDL

granulocyte count

None of the LVs have >1 SNPs at FDR<0.2.

lymphocyte count

myeloid white cell count

neutrophil count

plt

rbc

T2D_2

T2D

asthma

wbc

WHR

Effect size plots - checking the reverse causality

To check whether the effect size correlation is due to reverse causality, i.e. trait -> LV (the trait causally affects the LV) instead of LV -> trait (which is what we would like to see), I used all SNPs associated with the traits (pval<5E-8). The x-axis is the effects of these SNPs on the trait, and the y-axis is their effects on the LV.

Some pairs show p < 0.05, but this result may be driven by a possible causal effect of LV -> trait. To test this, I removed the SNPs that are associated with the LVs at FDR < 0.2 and made the plots again.

BMI

LDL

lymphocyte count

platelet count

red blood cell count

asthma

white blood cell count

WHR

QQplots

BMI

LV27, LV76 and LV90

LDL

LV125


platelet count

LV49

red blood cell count

LV82

asthma

LV36 and LV79

white blood cell count

LV6 and LV119

WHR

LV47

Colocalization – 200kb

The colocalization analysis was performed using the approximate Bayes factor test implemented in the Coloc package. Coloc computes five posterior probabilities (PP0, PP1, PP2, PP3 and PP4), each corresponding to a hypothesis: H0, no association with either trait; H1, association with trait 1 but not with trait 2; H2, association with trait 2 but not with trait 1; H3, association with trait 1 and trait 2, two independent SNPs; H4, association with trait 1 and trait 2, one shared SNP. We ran Coloc with the default parameters and used PP4 to assess evidence of colocalization. We visualized the colocalization of factor - QTLs and GWAS associations using the LocusCompareR package.
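To make the five posterior probabilities more concrete, the sketch below shows how per-SNP log approximate Bayes factors can be combined into PP0 to PP4. It is a simplified Python illustration of the published approach, not the Coloc package's code; the function names are hypothetical, the priors `p1=1e-4`, `p2=1e-4`, `p12=1e-5` are assumed to correspond to the package defaults, and `prior_sd` is left as a required argument because the text does not state it.

```python
import math
from typing import List, Sequence


def log_abf(beta: float, se: float, prior_sd: float) -> float:
    """Wakefield log approximate Bayes factor for one SNP."""
    z = beta / se
    r = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)   # shrinkage factor
    return 0.5 * (math.log(1.0 - r) + r * z * z)


def log_sum_exp(xs: Sequence[float]) -> float:
    """Numerically stable log(sum(exp(x)))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def coloc_posteriors(labf1: Sequence[float], labf2: Sequence[float],
                     p1: float = 1e-4, p2: float = 1e-4,
                     p12: float = 1e-5) -> List[float]:
    """Return [PP0, PP1, PP2, PP3, PP4] from per-SNP log ABFs of two traits."""
    if len(labf1) != len(labf2) or not labf1:
        raise ValueError("need two non-empty lists of equal length")
    s1 = log_sum_exp(labf1)
    s2 = log_sum_exp(labf2)
    s12 = log_sum_exp([a + b for a, b in zip(labf1, labf2)])
    # H3 sums over pairs of distinct SNPs: all pairs minus same-SNP pairs.
    distinct = -math.expm1(s12 - (s1 + s2))          # 1 - exp(s12 - s1 - s2)
    if distinct > 0:
        lh3 = math.log(p1) + math.log(p2) + s1 + s2 + math.log(distinct)
    else:
        lh3 = float("-inf")                          # e.g. a single SNP
    logs = [0.0,                                     # H0
            math.log(p1) + s1,                       # H1
            math.log(p2) + s2,                       # H2
            lh3,                                     # H3
            math.log(p12) + s12]                     # H4
    total = log_sum_exp(logs)
    return [math.exp(l - total) for l in logs]
```

A high PP4 requires the same SNP to carry strong evidence for both traits; when only one trait has a strong signal in the region, the mass shifts to PP1 or PP2.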

SNP selection: 1. Chose the SNPs in the info table. 2. For each SNP, the region used in the colocalization analysis is [pos-100kb, pos+100kb]. 3. All SNPs in this region are included in the analysis.
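The region selection can be expressed in a few lines of Python. This is a hypothetical helper for illustration only; the tuple layout `(rsid, chrom, pos)` is an assumption.

```python
from typing import List, Sequence, Tuple


def snps_in_window(snps: Sequence[Tuple[str, str, int]],
                   chrom: str, lead_pos: int,
                   half_width: int = 100_000) -> List[str]:
    """Return rsids on `chrom` within [lead_pos - 100kb, lead_pos + 100kb]."""
    lo, hi = lead_pos - half_width, lead_pos + half_width
    return [rsid for rsid, c, pos in snps if c == chrom and lo <= pos <= hi]
```

With a half-width of 100kb on each side, the full region spans 200kb, which matches the section title.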

BMI

LV27

PPs in colocalization analysis
statistic value note
nsnps 499 NA
PP.H0.abf 1.35039761367912e-145 no association with either trait
PP.H1.abf 7.86241579658514e-147 association with trait 1 but not with trait 2
PP.H2.abf 0.833675789922105 association with trait 2 but not with trait 1
PP.H3.abf 0.0484211763716254 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.117903033706277 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
statistic value note
nsnps 458 NA
PP.H0.abf 1.00714955121074e-08 no association with either trait
PP.H1.abf 1.07025940120869e-09 association with trait 1 but not with trait 2
PP.H2.abf 0.841859094244127 association with trait 2 but not with trait 1
PP.H3.abf 0.0893924054203391 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.0687484891937805 association with trait 1 and trait 2, one shared SNP

LV76

PPs in colocalization analysis
statistic value note
nsnps 750 NA
PP.H0.abf 0.0428105191769995 no association with either trait
PP.H1.abf 0.00481701109736574 association with trait 1 but not with trait 2
PP.H2.abf 0.811244208573467 association with trait 2 but not with trait 1
PP.H3.abf 0.0912307603958566 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.049897500756311 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
statistic value note
nsnps 551 NA
PP.H0.abf 0.0427388368431629 no association with either trait
PP.H1.abf 0.0223982915556723 association with trait 1 but not with trait 2
PP.H2.abf 0.588272582852666 association with trait 2 but not with trait 1
PP.H3.abf 0.308259737393146 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.0383305513553527 association with trait 1 and trait 2, one shared SNP

lymphocyte count

LV26

PPs in colocalization analysis
statistic value note
nsnps 1901 NA
PP.H0.abf 0.551400357412064 no association with either trait
PP.H1.abf 0.350390080527746 association with trait 1 but not with trait 2
PP.H2.abf 0.0485827272932968 association with trait 2 but not with trait 1
PP.H3.abf 0.0308533605382535 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.0187734742286402 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
statistic value note
nsnps 1534 NA
PP.H0.abf 0.624231586783693 no association with either trait
PP.H1.abf 0.29691846185701 association with trait 1 but not with trait 2
PP.H2.abf 0.0454737736049874 association with trait 2 but not with trait 1
PP.H3.abf 0.0216180395193858 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.0117581382349238 association with trait 1 and trait 2, one shared SNP

platelet count

LV49

PPs in colocalization analysis
statistic value note
nsnps 1532 NA
PP.H0.abf 0.00028384352554766 no association with either trait
PP.H1.abf 9.53409302315354e-05 association with trait 1 but not with trait 2
PP.H2.abf 0.721638825542442 association with trait 2 but not with trait 1
PP.H3.abf 0.242357492575299 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.0356244974264805 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
statistic value note
nsnps 1370 NA
PP.H0.abf 1.8417589323712e-05 no association with either trait
PP.H1.abf 4.40424788747121e-05 association with trait 1 but not with trait 2
PP.H2.abf 0.0468244133445686 association with trait 2 but not with trait 1
PP.H3.abf 0.111130501900955 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.841982624686278 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
statistic value note
nsnps 1986 NA
PP.H0.abf 0.00027523983668062 no association with either trait
PP.H1.abf 7.3056726176907e-05 association with trait 1 but not with trait 2
PP.H2.abf 0.69977172359009 association with trait 2 but not with trait 1
PP.H3.abf 0.185625687327869 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.114254292519184 association with trait 1 and trait 2, one shared SNP

red blood cell count

LV82

PPs in colocalization analysis
note
nsnps 2597 NA
PP.H0.abf 0.000552401715551205 no association with either trait
PP.H1.abf 0.000142894163020125 association with trait 1 but not with trait 2
PP.H2.abf 0.767428590867418 association with trait 2 but not with trait 1
PP.H3.abf 0.198483489352633 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.0333926239013786 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
note
nsnps 2106 NA
PP.H0.abf 0.000534082743566863 no association with either trait
PP.H1.abf 0.000147333853948397 association with trait 1 but not with trait 2
PP.H2.abf 0.741967762385206 association with trait 2 but not with trait 1
PP.H3.abf 0.204628989484599 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.0527218315326792 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
note
nsnps 1773 NA
PP.H0.abf 0.000532066325912198 no association with either trait
PP.H1.abf 0.000162142075422079 association with trait 1 but not with trait 2
PP.H2.abf 0.739156701624836 association with trait 2 but not with trait 1
PP.H3.abf 0.225215934691157 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.0349331552826738 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
note
nsnps 1733 NA
PP.H0.abf 0.000536897512154383 no association with either trait
PP.H1.abf 0.000146989117164231 association with trait 1 but not with trait 2
PP.H2.abf 0.745867436555273 association with trait 2 but not with trait 1
PP.H3.abf 0.20415056040199 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.0492981164134186 association with trait 1 and trait 2, one shared SNP

white blood cell count

LV6

PPs in colocalization analysis
note
nsnps 1692 NA
PP.H0.abf 2.42069422133774e-20 no association with either trait
PP.H1.abf 5.85832479206917e-21 association with trait 1 but not with trait 2
PP.H2.abf 0.752317669664185 association with trait 2 but not with trait 1
PP.H3.abf 0.18200280383898 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.0656795264968316 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
note
nsnps 2527 NA
PP.H0.abf 0.664208360811174 no association with either trait
PP.H1.abf 0.205620211187087 association with trait 1 but not with trait 2
PP.H2.abf 0.0885804081235787 association with trait 2 but not with trait 1
PP.H3.abf 0.027407817633705 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.0141832022444551 association with trait 1 and trait 2, one shared SNP

LV119

PPs in colocalization analysis
note
nsnps 1473 NA
PP.H0.abf 0.596714698368467 no association with either trait
PP.H1.abf 0.305368562821701 association with trait 1 but not with trait 2
PP.H2.abf 0.0522979566455676 association with trait 2 but not with trait 1
PP.H3.abf 0.0267445889878615 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.0188741931764027 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
note
nsnps 514 NA
PP.H0.abf 0.926358700221567 no association with either trait
PP.H1.abf 0.0498079011502066 association with trait 1 but not with trait 2
PP.H2.abf 0.0206041219700063 association with trait 2 but not with trait 1
PP.H3.abf 0.00110570655025839 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.00212357010796139 association with trait 1 and trait 2, one shared SNP

white blood cell count

LV47

PPs in colocalization analysis
note
nsnps 433 NA
PP.H0.abf 1.15537632748376e-06 no association with either trait
PP.H1.abf 6.97799012112253e-08 association with trait 1 but not with trait 2
PP.H2.abf 0.899698351085931 association with trait 2 but not with trait 1
PP.H3.abf 0.0542920116470383 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.0460084121108017 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
note
nsnps 807 NA
PP.H0.abf 1.75215744541055e-07 no association with either trait
PP.H1.abf 1.83287532668058e-08 association with trait 1 but not with trait 2
PP.H2.abf 0.850469002019758 association with trait 2 but not with trait 1
PP.H3.abf 0.088904189531708 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.0606266149040363 association with trait 1 and trait 2, one shared SNP

Colocalization – 200kb

The colocalization analysis was performed using the approximate Bayes factor test implemented in the Coloc package. Coloc computes five posterior probabilities (PP0, PP1, PP2, PP3 and PP4), one for each hypothesis: H0, no association with either trait; H1, association with trait 1 but not with trait 2; H2, association with trait 2 but not with trait 1; H3, association with trait 1 and trait 2, two independent SNPs; H4, association with trait 1 and trait 2, one shared SNP. We ran Coloc with the default parameters and used PP4 to assess evidence of colocalization. We visualized the colocalization of factor-QTLs and GWAS associations using the LocusCompareR package.
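
As a rough illustration of how the five posterior probabilities relate to each other, the sketch below combines per-SNP log approximate Bayes factors for the two traits into PP.H0 to PP.H4. It is written in Python for illustration only and is not the Coloc implementation. The function names are hypothetical, and the priors p1 = 1e-4, p2 = 1e-4 and p12 = 1e-5 are assumed here to correspond to the default parameters.

```python
import math

def logsum(xs):
    """Stable log(sum(exp(x))) over a list of log values."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def logdiff(a, b):
    """log(exp(a) - exp(b)); -inf when the difference is not positive."""
    if b >= a:
        return float("-inf")
    return a + math.log1p(-math.exp(b - a))

def combine_abf(l1, l2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of H0..H4 from per-SNP log ABFs of two traits.

    l1[i] and l2[i] are the log ABFs of SNP i for trait 1 and trait 2.
    The prior values are assumptions, see the text above.
    """
    lsum1, lsum2 = logsum(l1), logsum(l2)
    lsum12 = logsum([a + b for a, b in zip(l1, l2)])
    log_h = [
        0.0,                                                           # H0
        math.log(p1) + lsum1,                                          # H1
        math.log(p2) + lsum2,                                          # H2
        math.log(p1) + math.log(p2) + logdiff(lsum1 + lsum2, lsum12),  # H3
        math.log(p12) + lsum12,                                        # H4
    ]
    denom = logsum(log_h)
    return [math.exp(x - denom) for x in log_h]

# Made-up log ABFs: both traits peak at the same (first) SNP.
pp = combine_abf([10.0, 0.0, 0.0], [10.0, 0.0, 0.0])
for name, value in zip(["PP.H0", "PP.H1", "PP.H2", "PP.H3", "PP.H4"], pp):
    print(name, value)
```

With one strong signal at the same SNP in both traits, PP.H4 dominates. When the strongest signals sit at different SNPs, PP.H3 exceeds PP.H4.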

SNP selection:

  1. I chose the SNPs in the info table.
  2. For each SNP, the region used in the colocalization analysis is [pos - 100kb, pos + 100kb].
  3. All SNPs in this region are included in the analysis.
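
The window selection above can be sketched as a simple filter. This is a minimal Python illustration, not the code used for the analysis; the record layout (`id`, `chrom`, `pos`) and the SNP names are hypothetical, and matching on chromosome is an assumption added for completeness.

```python
def snps_in_window(lead, snps, half_width=100_000):
    """Return SNPs on the lead SNP's chromosome within +/- half_width bp."""
    lo, hi = lead["pos"] - half_width, lead["pos"] + half_width
    return [s for s in snps
            if s["chrom"] == lead["chrom"] and lo <= s["pos"] <= hi]

# Made-up records for illustration only.
lead = {"id": "snp_lead", "chrom": "1", "pos": 1_000_000}
snps = [
    {"id": "snp_a", "chrom": "1", "pos": 900_000},    # on the lower boundary
    {"id": "snp_b", "chrom": "1", "pos": 1_100_001},  # 1 bp outside the window
    {"id": "snp_c", "chrom": "2", "pos": 1_000_000},  # other chromosome
]
region = snps_in_window(lead, snps)
print([s["id"] for s in region])  # ['snp_a']
```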

BMI

LV27

PPs in colocalization analysis
note
nsnps 207 NA
PP.H0.abf 1.31495360908805e-145 no association with either trait
PP.H1.abf 5.2049528767862e-147 association with trait 1 but not with trait 2
PP.H2.abf 0.811794228353762 association with trait 2 but not with trait 1
PP.H3.abf 0.0319768491954829 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.156228922450753 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
note
nsnps 177 NA
PP.H0.abf 6.75116568372173e-09 no association with either trait
PP.H1.abf 5.40587281938705e-10 association with trait 1 but not with trait 2
PP.H2.abf 0.564285233583145 association with trait 2 but not with trait 1
PP.H3.abf 0.0447931895827552 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.390921569542346 association with trait 1 and trait 2, one shared SNP

LV76

PPs in colocalization analysis
note
nsnps 305 NA
PP.H0.abf 0.043354512024562 no association with either trait
PP.H1.abf 0.00365449985190367 association with trait 1 but not with trait 2
PP.H2.abf 0.820968372095008 association with trait 2 but not with trait 1
PP.H3.abf 0.0691393439964714 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.062883272032055 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
note
nsnps 198 NA
PP.H0.abf 0.049007904441049 no association with either trait
PP.H1.abf 0.0227480522824381 association with trait 1 but not with trait 2
PP.H2.abf 0.519650730686723 association with trait 2 but not with trait 1
PP.H3.abf 0.241039290569512 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.167554022020278 association with trait 1 and trait 2, one shared SNP

LDL

LV125

PPs in colocalization analysis
note
nsnps 159 NA
PP.H0.abf 1.65803672342948e-09 no association with either trait
PP.H1.abf 1.12233006918158e-10 association with trait 1 but not with trait 2
PP.H2.abf 0.620560571711587 association with trait 2 but not with trait 1
PP.H3.abf 0.0416681613863381 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.337771265131805 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
note
nsnps 280 NA
PP.H0.abf 8.66392262651223e-31 no association with either trait
PP.H1.abf 7.94897609482024e-31 association with trait 1 but not with trait 2
PP.H2.abf 0.499148954943497 association with trait 2 but not with trait 1
PP.H3.abf 0.457916269344939 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.0429347757115665 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
note
nsnps 274 NA
PP.H0.abf 8.82484066440544e-31 no association with either trait
PP.H1.abf 8.09234684377751e-31 association with trait 1 but not with trait 2
PP.H2.abf 0.508419821490731 association with trait 2 but not with trait 1
PP.H3.abf 0.466193743656564 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.0253864348527016 association with trait 1 and trait 2, one shared SNP

lymphocyte count

LV26

PPs in colocalization analysis
note
nsnps 804 NA
PP.H0.abf 0.845766444165032 no association with either trait
PP.H1.abf 0.110794734986824 association with trait 1 but not with trait 2
PP.H2.abf 0.0342314385498326 association with trait 2 but not with trait 1
PP.H3.abf 0.00447956354621296 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.00472781875209829 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
note
nsnps 586 NA
PP.H0.abf 0.681501963451757 no association with either trait
PP.H1.abf 0.274861949970861 association with trait 1 but not with trait 2
PP.H2.abf 0.0202160229511401 association with trait 2 but not with trait 1
PP.H3.abf 0.0081382023353829 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.0152818612908584 association with trait 1 and trait 2, one shared SNP

platelet count

LV49

PPs in colocalization analysis
note
nsnps 523 NA
PP.H0.abf 0.466538003013372 no association with either trait
PP.H1.abf 0.105846694770366 association with trait 1 but not with trait 2
PP.H2.abf 0.328747014945193 association with trait 2 but not with trait 1
PP.H3.abf 0.0745607953782274 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.0243074918928418 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
note
nsnps 474 NA
PP.H0.abf 8.1176654299053e-08 no association with either trait
PP.H1.abf 9.36336316698695e-08 association with trait 1 but not with trait 2
PP.H2.abf 0.099414413602958 association with trait 2 but not with trait 1
PP.H3.abf 0.11388336740316 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.786702044183596 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
note
nsnps 945 NA
PP.H0.abf 0.000323941065497684 no association with either trait
PP.H1.abf 5.35834117434399e-05 association with trait 1 but not with trait 2
PP.H2.abf 0.823567437081277 association with trait 2 but not with trait 1
PP.H3.abf 0.136187235765739 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.0398678026757416 association with trait 1 and trait 2, one shared SNP

red blood cell count

LV82

PPs in colocalization analysis
note
nsnps 932 NA
PP.H0.abf 0.862160613876924 no association with either trait
PP.H1.abf 0.095492523950811 association with trait 1 but not with trait 2
PP.H2.abf 0.0345160944404618 association with trait 2 but not with trait 1
PP.H3.abf 0.00381897538892352 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.00401179234287936 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
note
nsnps 1007 NA
PP.H0.abf 0.791255162135573 no association with either trait
PP.H1.abf 0.160499440153454 association with trait 1 but not with trait 2
PP.H2.abf 0.0352370317017519 association with trait 2 but not with trait 1
PP.H3.abf 0.00714166817009981 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.0058666978391218 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
note
nsnps 766 NA
PP.H0.abf 0.842378151311792 no association with either trait
PP.H1.abf 0.121767261460728 association with trait 1 but not with trait 2
PP.H2.abf 0.0270036088172732 association with trait 2 but not with trait 1
PP.H3.abf 0.00389846721795865 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.00495251119224877 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
note
nsnps 845 NA
PP.H0.abf 0.794937633920837 no association with either trait
PP.H1.abf 0.163102621759243 association with trait 1 but not with trait 2
PP.H2.abf 0.0291504707276119 association with trait 2 but not with trait 1
PP.H3.abf 0.00597416012226824 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.00683511347003905 association with trait 1 and trait 2, one shared SNP

white blood cell count

LV6

PPs in colocalization analysis
note
nsnps 790 NA
PP.H0.abf 2.55244831056944e-20 no association with either trait
PP.H1.abf 4.42587376275609e-21 association with trait 1 but not with trait 2
PP.H2.abf 0.793264984903673 association with trait 2 but not with trait 1
PP.H3.abf 0.137480668567092 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.0692543465292343 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
note
nsnps 925 NA
PP.H0.abf 0.827099341301274 no association with either trait
PP.H1.abf 0.123460953436092 association with trait 1 but not with trait 2
PP.H2.abf 0.0379056341098084 association with trait 2 but not with trait 1
PP.H3.abf 0.00565228464418128 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.00588178650864373 association with trait 1 and trait 2, one shared SNP

LV119

PPs in colocalization analysis
note
nsnps 558 NA
PP.H0.abf 2.59596880772956e-17 no association with either trait
PP.H1.abf 7.9810396381393e-18 association with trait 1 but not with trait 2
PP.H2.abf 0.73055366654872 association with trait 2 but not with trait 1
PP.H3.abf 0.224556336071001 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.0448899973802794 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
note
nsnps 301 NA
PP.H0.abf 0.962205966613379 no association with either trait
PP.H1.abf 0.0231789040116609 association with trait 1 but not with trait 2
PP.H2.abf 0.013173262987228 association with trait 2 but not with trait 1
PP.H3.abf 0.000316209517610261 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.00112565687012208 association with trait 1 and trait 2, one shared SNP

white blood cell count

LV47

PPs in colocalization analysis
note
nsnps 173 NA
PP.H0.abf 1.15814920856251e-06 no association with either trait
PP.H1.abf 3.90974515028206e-08 association with trait 1 but not with trait 2
PP.H2.abf 0.901857588140806 association with trait 2 but not with trait 1
PP.H3.abf 0.0303776514611794 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.0677635631513558 association with trait 1 and trait 2, one shared SNP
PPs in colocalization analysis
note
nsnps 288 NA
PP.H0.abf 1.84200928897656e-07 no association with either trait
PP.H1.abf 1.29379118273315e-08 association with trait 1 but not with trait 2
PP.H2.abf 0.894081636109299 association with trait 2 but not with trait 1
PP.H3.abf 0.0627553770637426 association with trait 1 and trait 2,two independent SNPs
PP.H4.abf 0.0431627896881166 association with trait 1 and trait 2, one shared SNP

Results - pval < 5e-8 & association test covariates: 10 PCs

Summary table

I used the ‘qvalue’ R package to compute the FDR from the p-values for each SNP and made a table showing the number of SNPs that pass each threshold. The thresholds are ‘fdr < 0.1’, ‘fdr < 0.2’ and ‘pval < 5e-8’.
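
The qvalue package estimates the proportion of true nulls (pi0) when converting p-values to q-values. The sketch below uses the Benjamini-Hochberg adjustment instead, which corresponds to fixing pi0 = 1 and is therefore somewhat more conservative. It is a stand-in written in Python for illustration, not the qvalue algorithm; the function names and example p-values are hypothetical.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def count_passing(pvals, fdr_cutoffs=(0.1, 0.2), p_cutoff=5e-8):
    """Number of SNPs passing each FDR cutoff and the p-value cutoff."""
    fdr = bh_adjust(pvals)
    counts = {f"fdr < {c}": sum(q < c for q in fdr) for c in fdr_cutoffs}
    counts[f"pval < {p_cutoff}"] = sum(p < p_cutoff for p in pvals)
    return counts

# Made-up p-values for four SNPs.
print(count_passing([0.01, 0.04, 0.03, 0.005]))
```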

Info table

For each trait, I made a table showing the info of SNPs with FDR < 0.2 in the factor ~ SNP + genotype PCs association test. For each trait, the LVs that have more than one significant SNP at FDR < 0.2 are included.
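
The rule for which LVs enter the table (more than one SNP at FDR < 0.2) amounts to counting hits per LV. A minimal Python sketch, with a hypothetical record layout (`lv`, `snp`, `fdr`) and made-up values:

```python
from collections import Counter

def lvs_with_multiple_hits(assoc, fdr_cutoff=0.2, min_snps=2):
    """LVs that have at least min_snps SNPs with FDR below the cutoff."""
    hits = Counter(row["lv"] for row in assoc if row["fdr"] < fdr_cutoff)
    return sorted(lv for lv, n in hits.items() if n >= min_snps)

# Made-up association results for one trait.
assoc = [
    {"lv": "LV27", "snp": "snp_a", "fdr": 0.05},
    {"lv": "LV27", "snp": "snp_b", "fdr": 0.15},
    {"lv": "LV3", "snp": "snp_c", "fdr": 0.10},
    {"lv": "LV3", "snp": "snp_d", "fdr": 0.40},
]
print(lvs_with_multiple_hits(assoc))  # ['LV27']
```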

The suffix ‘_assoc’ means that the results are from the factor ~ SNP + genotype PCs association test. The suffix ‘_gwas’ means that the results are from the original GWAS results files. For EUR.CD, EUR.IBD and EUR.UC, ‘effectsize_gwas’ is ‘ln(OR)’; for the other traits, it is ‘beta’.

‘snp_ld’ lists the SNPs that are in LD with the SNP in each row. ‘ld_r2’ is the LD r-squared corresponding to the ‘snp_ld’ column. The ‘cis-eqtl’ column indicates whether the SNP is a cis-eQTL according to GTEx data. ‘cis_gene_hgnc’ and ‘cis_gene_hgnc’ are the genes that the SNP influences when it acts as a cis-eQTL. ‘func’ and ‘func_gene’ are obtained from ANNOVAR and indicate the function of the SNP within the genes.

Enrichment analysis

For some promising trait-factor pairs (i.e. BMI-LV3, LV27, RBC-LV82, Asthma-LV68, WBC-LV119, Lymphocyte-LV23, LV78), I did enrichment analysis with WebGestalt. The analyses were run under different settings:

  1. ORA: For each factor, I sorted the genes by their loadings and took the top 25% as the gene set. The functional databases are Reactome pathway; Disgenet + GLAD4U + OMIM disease dataset; geneontology biological process. The reference set is affy hugene 2 0 st v1. The minimum number of genes for a category is 5 and the maximum is 2000 (default settings). All categories with fdr < 0.01 are listed.

BMI-LV3

BMI-LV27

BMI-LV76

BMI-LV90

RBC-LV82

Asthma-LV68

WBC-LV119

Lymphocyte-LV23

Lymphocyte-LV78
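
The ORA gene-set construction described above (top 25% of genes by loading) can be sketched as follows. This is a minimal Python illustration; the gene names and loadings are made up, and it assumes that larger loadings rank first.

```python
def top_fraction_genes(loadings, fraction=0.25):
    """Genes in the top `fraction` when ranked by loading, largest first."""
    ranked = sorted(loadings, key=loadings.get, reverse=True)
    n_top = max(1, int(len(ranked) * fraction))
    return ranked[:n_top]

# Made-up loadings for eight genes of one factor.
loadings = {"g1": 0.9, "g2": 0.0, "g3": 0.4, "g4": 1.3,
            "g5": 0.0, "g6": 0.1, "g7": 0.2, "g8": 0.05}
print(top_fraction_genes(loadings))  # ['g4', 'g1']
```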

  2. GSEA: For each factor, I used all genes (zero-loading genes included), taking the gene loadings as the gene scores in GSEA. The functional database is Reactome pathway.

BMI-LV3-Reactome BMI-LV3-GO BMI-LV3-phenotype BMI-LV3-disease-Disgenet

BMI-LV27-Reactome BMI-LV27-GO BMI-LV27-phenotype BMI-LV27-disease-Disgenet

BMI-LV76-Reactome BMI-LV76-GO BMI-LV76-phenotype BMI-LV76-disease-Disgenet

RBC-LV82-Reactome RBC-LV82-GO RBC-LV82-phenotype RBC-LV82-disease-Disgenet

Asthma-LV68-Reactome Asthma-LV68-GO Asthma-LV68-phenotype Asthma-LV68-disease-Disgenet

WBC-LV119-Reactome WBC-LV119-GO WBC-LV119-phenotype WBC-LV119-disease-Disgenet

Lymphocyte-LV23-Reactome Lymphocyte-LV23-GO Lymphocyte-LV23-phenotype Lymphocyte-LV23-disease-Disgenet

Lymphocyte-LV78-Reactome Lymphocyte-LV78-GO Lymphocyte-LV78-phenotype Lymphocyte-LV78-disease-Disgenet

  3. GSEA: For each factor, I used all genes (zero-loading genes included), taking the gene loadings as the gene scores in GSEA. For genes with loading = 0, I assigned a random number drawn from the normal distribution N(0,1e-5) to avoid ties. The functional database is Reactome pathway.

BMI-LV3

BMI-LV27

BMI-LV76

RBC-LV82

Asthma-LV68

WBC-LV119

Lymphocyte-LV23

Lymphocyte-LV78
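
The tie-breaking step in the last GSEA setting can be sketched like this. It is a minimal Python illustration rather than the code used for the analysis. N(0,1e-5) does not say whether 1e-5 is the variance or the standard deviation; the sketch assumes it is the standard deviation. The seed and gene names are hypothetical.

```python
import random

def break_zero_ties(loadings, sd=1e-5, seed=0):
    """Replace zero loadings with small N(0, sd) draws so ranks are unique.

    Non-zero loadings are returned unchanged. `sd` is assumed to be the
    standard deviation of the noise.
    """
    rng = random.Random(seed)
    return {gene: (value if value != 0 else rng.gauss(0.0, sd))
            for gene, value in loadings.items()}

# Made-up loadings: two genes are tied at zero.
scores = break_zero_ties({"g1": 0.8, "g2": 0.0, "g3": 0.0, "g4": 0.3})
print(sorted(scores, key=scores.get, reverse=True))
```

Because the noise is several orders of magnitude smaller than typical non-zero loadings in this example, it only affects the ordering among the zero-loading genes.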

Effect size plots

For each trait, I plotted the association with the LV (indicated by beta in GWAS) against the association with the trait (indicated by ln(odds ratio) or beta in GWAS) to show whether the variants have correlated effect directions. The effect sizes from the Catalog GWAS and the factor association tests were harmonized with the TwoSampleMR R package so that the effect alleles in the two analyses are identical. The LVs that have more than one significant SNP at FDR < 0.2 are included in the plots. In addition, for each plot, I fitted the points with intercept = 0. The p-values and r-squared are shown on the plots.
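
The zero-intercept fit can be written out directly. The Python sketch below computes the slope, its t statistic and the r-squared for a regression through the origin. The r-squared is the uncentered version (1 - RSS / sum(y^2)), which is an assumption about how the plotted value was computed. The p-value would follow from the t statistic with n - 1 degrees of freedom and is left out to keep the example dependency-free. The function name and the example effect sizes are made up.

```python
import math

def fit_through_origin(x, y):
    """Least-squares fit of y = slope * x with the intercept fixed at 0."""
    n = len(x)
    sxx = sum(a * a for a in x)
    sxy = sum(a * b for a, b in zip(x, y))
    slope = sxy / sxx
    rss = sum((b - slope * a) ** 2 for a, b in zip(x, y))
    # Uncentered r-squared, as is usual for models without an intercept.
    r_squared = 1.0 - rss / sum(b * b for b in y)
    se = math.sqrt(rss / (n - 1) / sxx)
    t_stat = slope / se if se > 0 else math.copysign(math.inf, slope)
    return {"slope": slope, "r_squared": r_squared, "t": t_stat, "df": n - 1}

# Made-up effect sizes: x = effect on the LV, y = effect on the trait.
fit = fit_through_origin([1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.8])
print(fit)
```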

BMI

Eosinophil count

Crohn’s disease

IBD

Ulcerative colitis

granulocyte count

lymphocyte count

myeloid white cell count

neutrophil count

None of the LVs have >1 SNPs at FDR<0.2.

plt

None of the LVs have >1 SNPs at FDR<0.2.

rbc

asthma

wbc

Effect size plots - more SNPs

For some promising trait-factor pairs (i.e. BMI-LV3, LV27, LV76, RBC-LV82, Asthma-LV68, WBC-LV119, Lymphocyte-LV23, LV78), I relaxed the FDR threshold for the SNPs used in the effect size plots (from 0.2 to 0.3/0.5).

I also made a plot to show the distribution of the SNPs’ FDR.

BMI

Crohn’s disease

The CD~lv88 pair is not very promising when considering SNPs at fdr < 0.2, but the fit at fdr < 0.5 is better, so I post the plots here too.

lymphocyte count

red blood cell count

The rbc~lv42 and rbc~lv59 pairs are not very promising when considering SNPs at fdr < 0.2, but the fits at fdr < 0.5 are better, so I post the plots here too.

asthma

white blood cell count

Effect size plots - checking the reverse causality

To check whether the effect size correlation is due to reverse causality, i.e. trait -> LV (the trait causally affects the LV) instead of LV -> trait (which is what we would like to see), I used all SNPs associated with the traits (pval < 5E-8). The x-axis shows the effects of these SNPs on the trait, and the y-axis shows their effects on the LV.

Some pairs show p < 0.05, but this result may be driven by a possible causal effect of LV -> trait. To test this, I removed the SNPs that are associated with the LVs at FDR < 0.2 and made the plots again.
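
The SNP removal step can be sketched as a filter on the trait-associated SNPs. This is a minimal Python illustration; the SNP names and FDR values are made up, and treating SNPs without an LV association result as not associated is an assumption.

```python
def drop_lv_associated(trait_snps, lv_fdr, fdr_cutoff=0.2):
    """Keep trait-associated SNPs that are not associated with the LV.

    trait_snps: SNP ids with trait pval < 5e-8.
    lv_fdr: mapping from SNP id to its FDR in the LV association test.
    SNPs missing from lv_fdr are kept (assumption).
    """
    return [snp for snp in trait_snps if lv_fdr.get(snp, 1.0) >= fdr_cutoff]

# Made-up inputs.
trait_snps = ["snp_a", "snp_b", "snp_c"]
lv_fdr = {"snp_a": 0.05, "snp_b": 0.60}
print(drop_lv_associated(trait_snps, lv_fdr))  # ['snp_b', 'snp_c']
```

If the correlation between trait effects and LV effects disappears after this filter, the original signal was probably carried by the LV-associated SNPs rather than by a trait -> LV effect.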

BMI

Crohn’s disease

The CD~lv88 pair is not very promising when considering SNPs at fdr < 0.2, but the fit at fdr < 0.5 is better, so I post the plots here too.

lymphocyte count

red blood cell count

The rbc~lv42 and rbc~lv59 pairs are not very promising when considering SNPs at fdr < 0.2, but the fits at fdr < 0.5 are better, so I post the plots here too.

asthma

white blood cell count

Results - pval < 5e-8 & association test covariates: 10 PCs + GTEx: Sequencing platform, Sequencing protocol, Sex

Summary table

I used the ‘qvalue’ R package to compute the FDR from the p-values for each SNP and made a table showing the number of SNPs that pass each threshold. The thresholds are ‘fdr < 0.1’, ‘fdr < 0.2’ and ‘pval < 5e-8’.

Info table

For each trait, I made a table showing the info of SNPs with FDR < 0.2 in the factor ~ SNP + genotype PCs association test. For each trait, the LVs that have more than one significant SNP at FDR < 0.2 are included.

The suffix ‘_assoc’ means that the results are from the factor ~ SNP + genotype PCs association test. The suffix ‘_gwas’ means that the results are from the original GWAS results files. For EUR.CD, EUR.IBD and EUR.UC, ‘effectsize_gwas’ is ‘ln(OR)’; for the other traits, it is ‘beta’.
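
If a GWAS file for a case-control trait reports odds ratios, putting it on the ‘ln(OR)’ scale is a single log transform. The Python sketch below assumes, hypothetically, that the raw value is an odds ratio for EUR.CD, EUR.IBD and EUR.UC and already a beta for the other traits; the function name is made up and the actual file formats are not described here.

```python
import math

# Traits whose GWAS effect is reported on the ln(OR) scale in the info table.
OR_TRAITS = ("EUR.CD", "EUR.IBD", "EUR.UC")

def effect_size_gwas(trait, value):
    """Return ln(OR) for the case-control traits and the beta otherwise.

    Assumes `value` is an odds ratio for traits in OR_TRAITS.
    """
    return math.log(value) if trait in OR_TRAITS else value

print(effect_size_gwas("EUR.CD", 1.0))  # 0.0
print(effect_size_gwas("BMI", 0.03))    # 0.03
```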

‘snp_ld’ lists the SNPs that are in LD with the SNP in each row. ‘ld_r2’ is the LD r-squared corresponding to the ‘snp_ld’ column. The ‘cis-eqtl’ column indicates whether the SNP is a cis-eQTL according to GTEx data. ‘cis_gene_hgnc’ and ‘cis_gene_hgnc’ are the genes that the SNP influences when it acts as a cis-eQTL. ‘func’ and ‘func_gene’ are obtained from ANNOVAR and indicate the function of the SNP within the genes.

Effect size plots

For each trait, I plotted the association with the LV (indicated by beta in GWAS) against the association with the trait (indicated by ln(odds ratio) or beta in GWAS) to show whether the variants have correlated effect directions. The effect sizes from the Catalog GWAS and the factor association tests were harmonized with the TwoSampleMR R package so that the effect alleles in the two analyses are identical. The LVs that have more than one significant SNP at FDR < 0.2 are included in the plots. In addition, for each plot, I fitted the points with intercept = 0. The p-values and r-squared are shown on the plots.
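
The idea behind the effect allele harmonization can be sketched as a sign flip when the two analyses report opposite effect alleles. This is a simplified Python illustration, not the TwoSampleMR procedure, which also deals with strand flips and palindromic SNPs. The field names (`ea`, `oa`, `beta`) and the example values are hypothetical.

```python
def harmonise_effect(reference, other):
    """Express `other`'s beta relative to `reference`'s effect allele.

    Returns the beta unchanged when the alleles match, the negated beta
    when effect and other allele are swapped, and None when the allele
    pairs cannot be reconciled by a simple swap.
    """
    ref_pair = (reference["ea"], reference["oa"])
    if ref_pair == (other["ea"], other["oa"]):
        return other["beta"]
    if ref_pair == (other["oa"], other["ea"]):
        return -other["beta"]
    return None

# Made-up example: the GWAS reports the opposite effect allele.
lv_assoc = {"ea": "A", "oa": "G", "beta": 0.10}
gwas = {"ea": "G", "oa": "A", "beta": 0.20}
print(harmonise_effect(lv_assoc, gwas))  # -0.2
```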

BMI

Eosinophil count

Crohn’s disease

IBD

Ulcerative colitis

granulocyte count

None of the LVs have >1 SNPs at FDR<0.2.

lymphocyte count

myeloid white cell count

neutrophil count

plt

rbc

T2D

None of the LVs have >1 SNPs at FDR<0.2.

asthma

wbc

Effect size plots - more SNPs

For some promising trait-factor pairs, I relaxed the FDR threshold for the SNPs used in the effect size plots (from 0.2 to 0.3/0.5).

BMI

The BMI~lv90 pair is not very promising when considering SNPs at fdr < 0.2, but the fit at fdr < 0.5 is better, so I post the plots here too.

lymphocyte count

red blood cell count

asthma

white blood cell count

Effect size plots - checking the reverse causality

To check whether the effect size correlation is due to reverse causality, i.e. trait -> LV (the trait causally affects the LV) instead of LV -> trait (which is what we would like to see), I used all SNPs associated with the traits (pval < 5E-8). The x-axis shows the effects of these SNPs on the trait, and the y-axis shows their effects on the LV.

Some pairs show p < 0.05, but this result may be driven by a possible causal effect of LV -> trait. To test this, I removed the SNPs that are associated with the LVs at FDR < 0.2 and made the plots again.

BMI

lymphocyte count

red blood cell count

asthma

white blood cell count


sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Scientific Linux 7.4 (Nitrogen)

Matrix products: default
BLAS/LAPACK: /software/openblas-0.2.19-el7-x86_64/lib/libopenblas_haswellp-r0.2.19.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] workflowr_1.6.2

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5        rstudioapi_0.11   whisker_0.3-2     knitr_1.30       
 [5] magrittr_1.5      R6_2.4.1          rlang_0.4.8       highr_0.8        
 [9] stringr_1.4.0     tools_3.6.1       DT_0.15           xfun_0.18        
[13] git2r_0.26.1      crosstalk_1.1.0.1 htmltools_0.5.0   ellipsis_0.3.1   
[17] rprojroot_1.3-2   yaml_2.2.1        digest_0.6.25     tibble_3.0.3     
[21] lifecycle_0.2.0   crayon_1.3.4      later_1.1.0.1     htmlwidgets_1.5.2
[25] vctrs_0.3.4       promises_1.1.1    fs_1.5.0          glue_1.4.2       
[29] evaluate_0.14     rmarkdown_1.13    stringi_1.5.3     compiler_3.6.1   
[33] pillar_1.4.6      backports_1.1.10  jsonlite_1.7.1    httpuv_1.5.1     
[37] pkgconfig_2.0.3